Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[WIP] Revert "Remove cleanup on side threads (#19378)" #19432

Closed
wants to merge 1 commit into from

Conversation

barry-jin
Copy link
Contributor

@barry-jin barry-jin commented Oct 27, 2020

Description

An intermediate solution to #19420 to unblock GluonNLP CI.
This reverts commit 43750c8.

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@mxnet-bot
Copy link

Hey @barry-jin , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [centos-gpu, sanity, unix-gpu, website, clang, windows-cpu, edge, centos-cpu, miscellaneous, windows-gpu, unix-cpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@lanking520 lanking520 added the pr-awaiting-testing PR is reviewed and waiting CI build and test label Oct 27, 2020
@leezu leezu requested a review from ptrendx October 27, 2020 17:13
@ptrendx
Copy link
Member

ptrendx commented Oct 27, 2020

I fail to understand how is this a bugfix. The original commit was introduced as a workaround for #19360 and the issue was raised (#19379) that a better solution should be put in its place. Just reverting it is not such better solution.

Could you provide details on how this commit triggers the issue you are seeing?

@barry-jin
Copy link
Contributor Author

I fail to understand how is this a bugfix. The original commit was introduced as a workaround for #19360 and the issue was raised (#19379) that a better solution should be put in its place. Just reverting it is not such better solution.

Could you provide details on how this commit triggers the issue you are seeing?

Sorry for losing the context of PR #19378 . It's definitely not the best solution. But our current issue #19420 has blocked GluonNLP CI and GluonNLP development. So, this PR is an intermediate solution rather than a bugfix to have Cuda 11 broken on CD and unblock GluonNLP development. I will investigate more to find a better solution.

@barry-jin barry-jin changed the title [BUGFIX] Revert "Remove cleanup on side threads (#19378)" [WIP] Revert "Remove cleanup on side threads (#19378)" Oct 27, 2020
@lanking520 lanking520 added pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Oct 27, 2020
@sxjscience
Copy link
Member

Ping also @mseth10 . @ptrendx We are seeing the error described in this issue #19420 after upgrading to the latest MXNet.

@leezu
Copy link
Contributor

leezu commented Oct 27, 2020

@mxnet-bot run ci [windows-gpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [windows-gpu]

@ptrendx
Copy link
Member

ptrendx commented Oct 27, 2020

@sxjscience Sure, I understand that there is an issue somewhere and I'm happy to help resolve it. That said, neither this PR nor the linked issue explains why the failure happens with that change enabled (I assume here that you verified that a commit just before #19378 passes). If this was about reverting a feature it would be fine, but here we are reverting a bugfix (or bug workaround to be more precise), so I think it makes sense to make sure we understand what is going on here before we reintroduce a problem.

@sxjscience
Copy link
Member

@ptrendx Yeah, I agree with that. I think we will need some time to understand what's happening underneath.

@barry-jin
Copy link
Contributor Author

@ptrendx Thanks. I will update the linked issue with more detailed information after my investigation.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-work-in-progress PR is still work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants